Accent type and phrase boundary estimation using acoustic and language models for automatic prosodic labeling

Authors

Tomoki Koriyama

Hiroshi Suzuki

Takashi Nose

Takahiro Shinozaki

Takao Kobayashi

Abstract

This paper proposes an automatic prosodic labeling technique for constructing the speech databases used in speech synthesis. In corpus-based Japanese speech synthesis, it is essential to use speech data annotated with prosodic information such as phrase boundaries and accent types. However, manual annotation is generally time-consuming and expensive. To overcome this problem, we propose a technique for estimating accent types and phrase boundaries from a speech waveform and its transcribed text using both language and acoustic models. We use a conditional random field (CRF) for the language model and a hidden Markov model (HMM) for the acoustic model, which has been shown to be effective for prosody modeling in speech synthesis. Introducing the HMM allows the continuously changing features of F0 contours to be modeled well, which results in higher estimation accuracy than conventional techniques that use a simple polygonal-line approximation of F0 contours.
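
The abstract does not say how the outputs of the two models are combined. As a minimal sketch only, assuming a log-linear combination of a language-model log-probability and an acoustic-model log-likelihood per candidate label, selecting an accent type could look like the following. The function names, the labels, the scores, and the weight `lm_weight` are all hypothetical and are not taken from the paper.

```python
import math


def combine_scores(lm_logprob, am_loglik, lm_weight=1.0):
    """Weighted sum of a language-model log-probability and an
    acoustic-model log-likelihood (an assumed combination rule)."""
    return lm_weight * lm_logprob + am_loglik


def best_accent_type(candidates, lm_weight=1.0):
    """Return the label with the highest combined score.

    `candidates` maps a hypothetical accent-type label to a pair
    (lm_logprob, am_loglik).
    """
    return max(
        candidates,
        key=lambda label: combine_scores(*candidates[label], lm_weight),
    )


# Made-up scores for one accent phrase, purely for illustration.
candidates = {
    "type0": (math.log(0.6), -120.0),
    "type1": (math.log(0.3), -118.5),
    "type2": (math.log(0.1), -125.0),
}

print(best_accent_type(candidates))                 # acoustic evidence dominates
print(best_accent_type(candidates, lm_weight=5.0))  # language model weighted up
```

With these made-up numbers, the acoustic score decides the label at the default weight, while a larger `lm_weight` lets the language model override it. The sketch only illustrates the trade-off between the two knowledge sources, not the paper's actual decoding procedure.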

Similar resources

Automatic prosodic labeling of accent information for Japanese spoken sentences

This paper describes a method of automatic labeling of prosodic information focusing on accent types and accent phrase boundaries for Japanese spoken sentences. They are predicted by CRF (Conditional Random Fields) using linguistic information and F0 contour information. In the prediction of the accent type, we propose a method that uses a provisional accent type predicted by linguistic informa...

Combining acoustic, lexical, and syntactic evidence for automatic unsupervised prosody labeling

Automatic labeling of prosodic events in speech has potentially significant implications for spoken language processing applications, and has received much attention over the years, especially after the introduction of annotation standards such as ToBI. Current labeling techniques are based on supervised learning, relying on the availability of a corpus that is annotated with the prosodic label...

An Intonational Phrase Boundary and Pitch Accent Dependent Speech Recognizer

Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that improves word recognition. We describe the idea of prosody dependent speech recognition by building a prosody dependent speech recognizer that conditions word and phoneme models on two important prosodic variables: intonational phrase bou...

Exploiting Acoustic and Syntactic Features for Prosody Labeling in a Maximum Entropy Framework

In this paper we describe an automatic prosody labeling framework that exploits both language and speech information. We model the syntactic-prosodic information with a maximum entropy model that achieves an accuracy of 85.2% and 91.5% for pitch accent and boundary tone labeling on the Boston University Radio News corpus. We model the acoustic-prosodic stream with two different models, one a max...

Analysis of Inconsistencies in Cross-Lingual Automatic ToBI Tonal Accent Labeling

This paper presents an experimental study on how corpus-based automatic prosodic information labeling can be transferred from a source language to a different target language. Tone accent identification models trained for Spanish, using the ESMA corpus, are used to automatically assign tonal accent ToBI labels on the (English) Boston Radio news corpus, and vice versa. Using just local raw proso...

Publication date: 2014
